A Comparative Study of Web-pages Classification Methods using Fuzzy Operators Applied to Arabic Web-pages

نویسندگان

  • Hua - Jun Zeng
  • Benyu Zhang
  • Wei - Ying Ma Yuchang Lu
  • S. T. Dumais
  • D. Michie
  • D. J. Spiegelhalter
  • David M. Pennock
  • David bell Hue Wang
چکیده

In this study, a fuzzy similarity approach for Arabic web pages classification is presented. The approach uses a fuzzy term-category relation by manipulating membership degree for the training data and the degree value for a test web page. Six measures are used and compared in this study. These measures include: Einstein, Algebraic, Hamacher, MinMax, Special case fuzzy and Bounded Difference approaches. These measures are applied and compared using 50 different Arabic web-pages. Einstein measure was gave best performance among the other measures. An analysis of these measures and concluding remarks are drawn in this study. Keywords— Text classification, HTML, Web pages, Machine learning, Fuzzy logic, Arabic Web pages.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparative Study of Web-pages Classification Methods using Fuzzy Operators Applied to Arabic Web-pages

In this study, a fuzzy similarity approach for Arabic web pages classification is presented. The approach uses a fuzzy term-category relation by manipulating membership degree for the training data and the degree value for a test web page. Six measures are used and compared in this study. These measures include: Einstein, Algebraic, Hamacher, MinMax, Special case fuzzy and Bounded Difference ap...

متن کامل

Arabic WebPages Classification Based On Fuzzy Association

Information retrieval from web documents becomes a challenge according to the exponential growth in the number of web pages on the Internet and rapid changes in these pages. So, it is necessary to classify web pages into classes to provide their results for the applied tools to be used which makes information retrieval easier and helps in facilitating applications use on the internet. Web pages...

متن کامل

Analyzing new features of infected web content in detection of malicious web pages

Recent improvements in web standards and technologies enable the attackers to hide and obfuscate infectious codes with new methods and thus escaping the security filters. In this paper, we study the application of machine learning techniques in detecting malicious web pages. In order to detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...

متن کامل

A Technique for Improving Web Mining using Enhanced Genetic Algorithm

World Wide Web is growing at a very fast pace and makes a lot of information available to the public. Search engines used conventional methods to retrieve information on the Web; however, the search results of these engines are still able to be refined and their accuracy is not high enough. One of the methods for web mining is evolutionary algorithms which search according to the user interests...

متن کامل

ارزیابی کیفیت صفحات‌ وب پژوهشگاه‌های وابسته به وزارت علوم، تحقیقات و فن‌آوری‌ مستقر در شهر تهران از دیدگاه کاربران

Especially in research centers, evaluating the quality of web pages from clients' point of view has a constructive role in their design and development, since it makes the web developers familiar with client's perspective and assists them in designing client-oriented web sites in scientific and research environment. As a model for assessing the quality of web pages, "webQual" attempts to provid...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009